A virtual ecological approach to modeling uncertainties in palaeoecology

Quinn Asena

University of Auckland, Cary Institute

George Perry

University of Auckland

Janet Wilmshurst

Manaaki Whenua - Landcare Research

2025-11-18

Uncertainties in palaeoecology


The problem


Proxy data are the product of multiple sources of uncertainty


  • Environmental processes
    • bioturbation, taphonomy, variable sedimentation rates…
  • Field and laboratory methods
    • core collection methods, sub-sampling strategy, pollen counting…
  • Data processing methods
    • age-depth modelling, interpolation…

The question: is the past recoverable from the data?


Why it matters


  • Palaeoecology moving from descriptive to quantitative
  • Palaeoecology to inform the future requires robust statistical approaches
  • Advances in lab methods, data availability, and statistics are making more inferences possible

What we can do about it


  • One way to assess uncertainties is through simulation
  • We use pseudoproxy modelling / Virtual ecology

Approach


  • Simulate core samples containing proxies mimicking the statistical properties of empirical data
  • Simulate process and observer error that affect the data
  • Assess how statistical inferences are affected by process and observer error

Take home message


  • Proxy uncertainty is the counterpart to chronological uncertainty
  • Proxy uncertainty and chronological uncertainty are not separate and have combined effects

Key concepts

Virtual ecology

Virtual ecology is a framework for assessing sampling and analytical methods in simulation consisting of:

  1. An ecological model that generates synthetic data

    1a. a degradation model

  2. A simulated observational process (a sampling model) that samples the synthetic data

  3. An analytical process or statistical model applied to both sets of data

  4. An assessment of the results
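
The four steps above can be sketched as a toy pipeline. This is a minimal illustration under stated assumptions, not the model used in the study: the noisy linear trend, the moving-average "degradation", the every-4th-value sampling, and the use of variance as the analysis are all hypothetical placeholders.

```python
import random
import statistics

def ecological_model(n_steps, rng):
    """Step 1: generate an 'error-free' synthetic series (here a noisy linear trend)."""
    return [0.1 * t + rng.gauss(0, 0.5) for t in range(n_steps)]

def degradation_model(series, window=3):
    """Step 1a: blur the record with a centred moving average (a stand-in for mixing)."""
    half = window // 2
    out = []
    for i in range(len(series)):
        chunk = series[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def sampling_model(series, every=4):
    """Step 2: the virtual observer only sees every `every`-th value."""
    return series[::every]

def analysis(series):
    """Step 3: the same statistic is applied to both data sets (here the variance)."""
    return statistics.pvariance(series)

def run_virtual_ecology(seed=1):
    rng = random.Random(seed)
    truth = ecological_model(200, rng)
    observed = sampling_model(degradation_model(truth))
    # Step 4: assess how far the observed result departs from the benchmark.
    return analysis(truth), analysis(observed)

benchmark, degraded = run_virtual_ecology()
print(f"benchmark variance: {benchmark:.2f}, observed variance: {degraded:.2f}")
```

Because the benchmark is known, any difference between the two numbers can be attributed to the simulated degradation and sampling, which is the comparison empirical data cannot offer.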

Virtual ecology and empirical ecology


Perfect knowledge, imperfect world

  • Known drivers and responses
  • Known environmental and observational processes
  • Advantage of benchmark/control data
  • Advantage of replication
  • Able to systematically introduce uncertainty

Perfect world, imperfect knowledge

  • Sampled data with no benchmark/control
  • Advantage of being reality

Pseudoproxy experiments


Borrowing the term “pseudoproxies” from climatology:

  • Pseudoproxies are simulated data or modified observational data
  • Mimic the statistical properties of empirical data
  • Pseudoproxy experiments are similar to virtual ecology

Proxy system modelling


The process by which environmental change is recorded as an observable signal in an archive:

  1. Environmental drivers (e.g., climatic variability)
  2. A sensor (a component of the system that responds to the environmental drivers)
  3. An archive (the medium in which the response of the sensor is recorded)
  4. Observations drawn from the archive

Building the model

Let’s follow a single-replicate case study

Simulating pseudoproxies


We set out to:

  • Represent multiple interacting drivers
  • Include underlying ecological dynamics (e.g., growth rates, niche breadth)
  • Generate a multi-species pseudoproxy record
  • Recreate core formation processes of accumulation rates and time-span
  • Virtually recreate the observational processes
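
As a rough illustration of the sensor part of these aims, species can be given a niche optimum and breadth and allowed to respond to a driver. The sketch below is a simplification under stated assumptions: it uses a single sine-wave driver and a Gaussian niche, and omits growth rates and driver interactions. The function names and parameter ranges are hypothetical.

```python
import math
import random

def driver_series(n_steps, period=100.0):
    """A hypothetical environmental driver: one sine wave scaled to the range 0..1."""
    return [0.5 + 0.5 * math.sin(2 * math.pi * t / period) for t in range(n_steps)]

def niche_response(env, optimum, breadth):
    """Gaussian niche: suitability peaks at `optimum` and declines with distance from it."""
    return math.exp(-((env - optimum) ** 2) / (2 * breadth ** 2))

def simulate_pseudoproxies(n_species=5, n_steps=200, seed=42):
    rng = random.Random(seed)
    env = driver_series(n_steps)
    # Each species gets its own randomly drawn optimum and niche breadth.
    traits = [(rng.uniform(0, 1), rng.uniform(0.05, 0.3)) for _ in range(n_species)]
    # Rows are time steps, columns are species suitabilities in 0..1.
    return [[niche_response(e, opt, br) for opt, br in traits] for e in env]

record = simulate_pseudoproxies()
print(len(record), "time steps x", len(record[0]), "species")
```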

Simulating pseudoproxies

Ruining pseudoproxies

Extending the proxy system model framework


  • Included a degradation (sub-)model to represent environmental processes

“Error-free” to degraded and sub-sampled

Example of 20 of the 200 species, randomly selected

Degraded and sub-sampled pseudoproxies

Analysing the outputs

Analyses


OK, now that we have generated the data, let’s analyse it. Two analyses:

  • Fisher Information
  • Principal curves

Demonstrating two scenarios with different driving environments

Analyses visualised

Recap!


  1. Environmental driver patterns over time (environment model)
  2. Species that respond to the drivers (sensor model – pseudoproxies)
  3. Core representation: accumulation rate and time-span (archive model)
  4. Core mixing (degradation model)
  5. Sampling and counting process (observation model)
  6. Analyse the pseudoproxies (assessment – Virtual Ecology)
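
Step 5, the observation model, can be sketched as two operations: keeping only some layers of the core, and counting a fixed number of individuals per retained layer. This is an illustrative sketch only. The toy core, the sampling interval, and the count of 100 are hypothetical values, and drawing counts in proportion to abundance is an assumed stand-in for the counting process.

```python
import random
from collections import Counter

def subsample_depths(core, every):
    """Keep every `every`-th layer, mimicking a coarser sub-sampling resolution."""
    return core[::every]

def count_proxies(abundances, n_counted, rng):
    """Draw `n_counted` individuals with probability proportional to abundance."""
    species = list(range(len(abundances)))
    draws = rng.choices(species, weights=abundances, k=n_counted)
    tally = Counter(draws)
    return [tally.get(s, 0) for s in species]

rng = random.Random(7)
# Hypothetical 'error-free' core: 12 layers x 4 species of true abundances.
core = [[50 + layer, 30, 15, 5] for layer in range(12)]
observed = [
    count_proxies(layer, n_counted=100, rng=rng)
    for layer in subsample_depths(core, every=3)
]
for counts in observed:
    print(counts)
```

Lower counts make rare species more likely to be missed entirely, which is one way observer error removes information from the record.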

Previous slides followed:

  • 1 scenario
    • we simulated four different driving environments
  • 1 replicate
    • we simulated 30 replicates to account for stochasticity
  • 1 uncertainty level
    • we simulated 1210 uncertainty levels (all combinations of mixing, sub-sampling, and proxy counting)
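
"All combinations" means a full factorial grid, so the number of datasets is the product of the number of levels of each uncertainty. The sketch below shows the idea with a deliberately small grid. The level values are hypothetical placeholders, not the levels used in the study.

```python
from itertools import product

# Hypothetical levels for each source of uncertainty (placeholders, not the study's values).
mixing_levels = [0, 2, 4]          # e.g. mixing depth, 0 = no mixing
subsampling_levels = [1, 2, 4]     # e.g. keep every n-th layer, 1 = keep all
counting_levels = [400, 100]       # e.g. individuals counted per sample

uncertainty_grid = list(product(mixing_levels, subsampling_levels, counting_levels))

# The grid size is the product of the number of levels per uncertainty.
print(len(uncertainty_grid))   # 18
# With levels ordered from least to most degraded, the first entry is the least degraded.
print(uncertainty_grid[0])     # (0, 1, 400)
```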

Question time

Extending to multiple scenarios and replicates


Each replicate results in 1210 datasets from the ‘error-free’ to the most uncertain 😱


Across replicates for each of the 1210 datasets:

  1. Extract features from the FI and PrC

    • feature analysis reduces the FI and PrC to one dimension
  2. Calculate the distance between every pair of datasets, from the ‘error-free’ to the most uncertain

  3. Make cool visualisations!

Think of it like this

  • Each column is a Fisher Information (or PrC) series that we saw earlier

Feature analysis

  • Feature Analysis for Time Series reduces the dimensions of the data to a set of comparable metrics (‘features’)
  • Extracted ‘features’ are compared across uncertainty levels using Euclidean distance
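
A minimal sketch of the idea: summarise each series by a short vector of features, then measure how far apart two series are in feature space. The three features chosen here (mean, standard deviation, lag-1 autocorrelation) and the toy series are assumptions for illustration and are not the feature set used in the study.

```python
import math
import statistics

def lag1_autocorrelation(series):
    """Correlation between the series and itself shifted by one step."""
    mean = statistics.fmean(series)
    num = sum((a - mean) * (b - mean) for a, b in zip(series, series[1:]))
    den = sum((a - mean) ** 2 for a in series)
    return num / den if den else 0.0

def extract_features(series):
    """Summarise a series by a few illustrative features (hypothetical choice)."""
    return [
        statistics.fmean(series),
        statistics.pstdev(series),
        lag1_autocorrelation(series),
    ]

def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.dist(a, b)

error_free = [math.sin(t / 5) for t in range(100)]
# A crude stand-in for an uncertain record: the same series, coarsely sub-sampled.
degraded = error_free[::10]

d = euclidean(extract_features(error_free), extract_features(degraded))
print(f"feature-space distance: {d:.3f}")
```

Features reduce series of different lengths to vectors of the same length, which is what makes the ‘error-free’ and sub-sampled records directly comparable. In practice the features would usually be rescaled first so that no single feature dominates the distance.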

Assessing results (Virtual Ecology)

  • Each uncertainty applied individually to Fisher Information

Assessing results (Virtual Ecology)

  • Two uncertainties applied to Fisher Information

Question time

  • We have 3-dimensional uncertainty plots but maybe we have seen enough plots for now!

How does this help?

  • Understand the relative effects of uncertainties
  • Consider how we can mitigate uncertainty
  • Take advantage of replicates and ‘known’ conditions
  • Understand the effect of proxy uncertainty on the chronological placement of events

Application to empirical studies

Simulation methods can be integrated with empirical studies to:

  • a priori help shape field sampling methods: e.g., number of core samples (across a region or local replication) required for a given research question

  • understand the sub-sampling and count resolution required to increase the likelihood of detecting a hypothesised signal in the data

  • accompany empirical study to test hypotheses about the underlying dynamics that may cause an observed pattern in the data

  • assess whether inferences made from the data are robust to uncertainty

Outlook

  • Assessing error rates
  • Chronological uncertainty combined with proxy uncertainty
  • Extend underlying dynamics

Final question time!

“All models are wrong, some are useful” Box (1979)

Acknowledgements

  • George Perry (University of Auckland)

  • Janet Wilmshurst (Manaaki Whenua – Landcare Research)

  • Jack Williams (University of Wisconsin–Madison)

  • Tony Ives (University of Wisconsin–Madison)

  • Biological Heritage Science Challenge (NZ) and the National Science Foundation (USA)

References

Asena, Quinn, George L. W. Perry, and Janet M. Wilmshurst. 2025. “Information Loss in Palaeoecological Data from Process and Observer Error.” EGUsphere, March, 1–31. https://doi.org/10.5194/egusphere-2024-3845.
Asena, Quinn, George L. W. Perry, and Janet M. Wilmshurst. 2024. “Is the Past Recoverable from the Data? Pseudoproxy Modelling of Uncertainties in Palaeoecological Data.” The Holocene, 09596836241247304.
Blaauw, Maarten, K. D. Bennett, and J. Andrés Christen. 2010. “Random Walk Simulations of Fossil Proxy Data.” The Holocene 20 (4): 645–49. https://doi.org/10.1177/0959683609355180.
Box, G. E. P. 1979. “Robustness in the Strategy of Scientific Model Building.” In Robustness in Statistics, 201–36. Elsevier. https://doi.org/10.1016/B978-0-12-438150-6.50018-2.
Evans, M. N., S. E. Tolwinski-Ward, D. M. Thompson, and K. J. Anchukaitis. 2013. “Applications of Proxy System Modeling in High Resolution Paleoclimatology.” Quaternary Science Reviews 76 (September): 16–28. https://doi.org/10.1016/j.quascirev.2013.05.024.
Mann, Michael E., and Scott Rutherford. 2002. “Climate Reconstruction Using Pseudoproxies.” Geophysical Research Letters 29 (10): 139-1-139-4. https://doi.org/10.1029/2001GL014554.
Williams, John W., Jessica L. Blois, and Bryan N. Shuman. 2011. “Extrinsic and Intrinsic Forcing of Abrupt Ecological Change: Case Studies from the Late Quaternary.” Journal of Ecology 99 (3): 664–77. https://doi.org/10.1111/j.1365-2745.2011.01810.x.
Zurell, Damaris, Uta Berger, Juliano S. Cabral, Florian Jeltsch, Christine N. Meynard, Tamara Münkemüller, Nana Nehrbass, et al. 2010. “The Virtual Ecologist Approach: Simulating Data and Observers.” Oikos 119 (4): 622–35. https://doi.org/10.1111/j.1600-0706.2009.18284.x.

Extras

Think of it like this

  • Fisher Information and PrC series

Think of it like this

  • Each column is a Fisher Information (or PrC) series

Think of it like this

  • Now, each column is a ‘feature’ describing the series

Think of it like this

  • Now we have a distance matrix of every combination of uncertainty

Assessing results